Results 1 - 20 of 9,466
1.
Sensors (Basel) ; 24(7)2024 Mar 22.
Article in English | MEDLINE | ID: mdl-38610256

ABSTRACT

The ongoing biodiversity crisis, driven by factors such as land-use change and global warming, emphasizes the need for effective ecological monitoring methods. Acoustic monitoring of biodiversity has emerged as an important monitoring tool. Detecting human voices in soundscape monitoring projects is useful both for analyzing human disturbance and for privacy filtering. Despite significant strides in deep learning in recent years, the deployment of large neural networks on compact devices poses challenges due to memory and latency constraints. Our approach focuses on leveraging knowledge distillation techniques to design efficient, lightweight student models for speech detection in bioacoustics. In particular, we employed the MobileNetV3-Small-Pi model to create compact yet effective student architectures to compare against the larger EcoVAD teacher model, a well-regarded voice detection architecture in eco-acoustic monitoring. The comparative analysis included examining various configurations of the MobileNetV3-Small-Pi-derived student models to identify optimal performance. Additionally, a thorough evaluation of different distillation techniques was conducted to ascertain the most effective method for model selection. Our findings revealed that the distilled models exhibited comparable performance to the EcoVAD teacher model, indicating a promising approach to overcoming computational barriers for real-time ecological monitoring.
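
The teacher-student setup described above can be sketched with a temperature-scaled distillation loss in the style of Hinton et al.'s knowledge distillation; the logits and temperature below are illustrative, not values from the paper.

```python
import math

def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax: higher T softens the distribution."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(student_logits, teacher_logits, temperature=4.0):
    """KL divergence between the softened teacher and student distributions,
    scaled by T^2 so gradients stay comparable across temperatures."""
    p = softmax(teacher_logits, temperature)   # soft targets from the teacher
    q = softmax(student_logits, temperature)   # student predictions
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))
    return kl * temperature ** 2

# Hypothetical logits for a binary speech / no-speech decision.
teacher = [2.5, -1.0]
student = [1.8, -0.4]
loss = distillation_loss(student, teacher)
```

In training, this term would be mixed with the ordinary cross-entropy on hard labels; the student here stands in for a compact MobileNet-style model, the teacher for the larger EcoVAD network.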


Subjects
Speech, Voice, Humans, Acoustics, Biodiversity, Knowledge
2.
J Acoust Soc Am ; 155(4): 2603-2611, 2024 Apr 01.
Article in English | MEDLINE | ID: mdl-38629881

ABSTRACT

Open science practices have led to an increase in available speech datasets for researchers interested in acoustic analysis. Accurate evaluation of these databases frequently requires manual or semi-automated analysis. The time-intensive nature of these analyses makes them ideally suited for research assistants in laboratories focused on speech and voice production. However, the completion of high-quality, consistent, and reliable analyses requires clear rules and guidelines for all research assistants to follow. This tutorial will provide information on training and mentoring research assistants to complete these analyses, covering areas including RA training, ongoing data analysis monitoring, and documentation needed for reliable and re-creatable findings.


Assuntos
Distúrbios da Voz , Voz , Humanos , Acústica , Fala
3.
Sci Rep ; 14(1): 8977, 2024 Apr 18.
Article in English | MEDLINE | ID: mdl-38637516

ABSTRACT

Why do we prefer some singers to others? We investigated how much singing voice preferences can be traced back to objective features of the stimuli. To do so, we asked participants to rate short excerpts of singing performances in terms of how much they liked them as well as in terms of 10 perceptual attributes (e.g., pitch accuracy, tempo, breathiness). We modeled liking ratings based on these perceptual ratings, as well as on acoustic features and low-level features derived from Music Information Retrieval (MIR). Mean liking ratings for each stimulus were highly correlated between Experiments 1 (online, US-based participants) and 2 (in the lab, German participants), suggesting a role for attributes of the stimuli in grounding average preferences. We show that acoustic and MIR features barely explain any variance in liking ratings; in contrast, perceptual ratings of the voices achieved around 43% prediction accuracy. Inter-rater agreement in liking and perceptual ratings was low, indicating substantial (and unsurprising) individual differences in participants' preferences and perception of the stimuli. Our results indicate that singing voice preferences are not grounded in acoustic attributes of the voices per se, but in how these features are perceptually interpreted by listeners.
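
The cross-experiment agreement reported above rests on correlating mean liking ratings per stimulus; a minimal Pearson correlation over made-up ratings (not the study's data) might look like this.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length rating lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

# Hypothetical mean liking ratings for five excerpts in two experiments.
exp1 = [3.2, 4.1, 2.5, 4.8, 3.9]
exp2 = [3.0, 4.3, 2.8, 4.6, 4.0]
r = pearson_r(exp1, exp2)
```

A high r across experiments, despite low inter-rater agreement within each, is exactly the pattern the abstract describes: averages are stable even when individuals disagree.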


Assuntos
Música , Canto , Voz , Humanos , Qualidade da Voz , Acústica
4.
PLoS One ; 19(4): e0301336, 2024.
Article in English | MEDLINE | ID: mdl-38625932

ABSTRACT

Recognizing genuine human emotion is an essential task for customer-feedback and medical applications. Many methods recognize the type of emotion from the speech signal by extracting frequency, pitch, and other dominant features, which are then used to train models to detect human emotions automatically. However, speech features alone cannot be relied on to detect emotion: for instance, an angry customer may still speak in a low voice (reflected in the frequency components), which leads to wrong predictions. Even a video-based emotion detection system can be fooled by feigned facial expressions. To rectify this, a parallel model can be trained on textual data and make predictions based on the words present in the text, classifying emotions from more comprehensive information and making the overall system more robust. To address this issue, we tested four text-based classification models for classifying the emotions of a customer. Comparing their results showed that a modified encoder-decoder model with an attention mechanism trained on textual data achieved an accuracy of 93.5%. This research highlights the pressing need for more robust emotion recognition systems and underscores the potential of transfer models with attention mechanisms to significantly improve feedback management processes and medical applications.
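
The attention mechanism invoked above can be illustrated with a minimal scaled dot-product attention over toy word vectors; the embeddings and dimensions are invented for illustration and are not the paper's model.

```python
import math

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def attention(query, keys, values):
    """Scaled dot-product attention: weight each value vector by how
    well its key matches the query, then sum."""
    d = len(query)
    scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(d)
              for key in keys]
    weights = softmax(scores)
    context = [sum(w * v[i] for w, v in zip(weights, values))
               for i in range(len(values[0]))]
    return context, weights

# Toy 2-d embeddings for three words of a customer message.
keys = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
values = [[0.2, 0.8], [0.9, 0.1], [0.5, 0.5]]
context, weights = attention([1.0, 0.0], keys, values)
```

In an encoder-decoder classifier, the decoder's query attends over all encoder states this way, letting emotionally loaded words dominate the context vector.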


Assuntos
Emoções , Voz , Masculino , Humanos , Fala , Linguística , Reconhecimento Psicológico
5.
Nat Commun ; 15(1): 1873, 2024 Mar 12.
Article in English | MEDLINE | ID: mdl-38472193

ABSTRACT

Voice disorders resulting from various pathological vocal fold conditions or postoperative recovery from laryngeal cancer surgery are common causes of dysphonia. Here, we present a self-powered wearable sensing-actuation system based on soft magnetoelasticity that enables assisted speaking without relying on the vocal folds. It has a light weight of approximately 7.2 g, a skin-like modulus of 7.83 × 10⁵ Pa, stability against skin perspiration, and a maximum stretchability of 164%. The wearable sensing component can effectively capture extrinsic laryngeal muscle movements and convert them into high-fidelity, analyzable electrical signals, which can be translated into speech signals with the assistance of machine learning algorithms with an accuracy of 94.68%. Then, with the wearable actuation component, the speech can be expressed as voice signals while circumventing vocal fold vibration. We expect this approach could facilitate the restoration of normal voice function and significantly enhance the quality of life of patients with dysfunctional vocal folds.


Assuntos
Distúrbios da Voz , Voz , Dispositivos Eletrônicos Vestíveis , Humanos , Prega Vocal/fisiologia , Qualidade de Vida , Voz/fisiologia
6.
Sensors (Basel) ; 24(5)2024 Feb 25.
Article in English | MEDLINE | ID: mdl-38475029

ABSTRACT

In recent years, there has been a notable rise in the number of patients afflicted with laryngeal diseases, including cancer, trauma, and other ailments leading to voice loss. The market is witnessing a pressing demand for medical and healthcare products designed to assist individuals with voice defects, prompting the invention of the artificial throat (AT). This user-friendly device eliminates the need for complex procedures like phonation reconstruction surgery. In this review, we first give a careful introduction to the intelligent AT, which can act not only as a sound sensor but also as a thin-film sound emitter. We then discuss the sensing principles used to detect sound, including the capacitive, piezoelectric, electromagnetic, and piezoresistive mechanisms employed in sound sensing. Following this, the development of thermoacoustic theory and the different materials used for sound emitters are analyzed. Next, the various algorithms utilized by the intelligent AT for speech pattern recognition are reviewed, including classical algorithms and neural network algorithms. Finally, the outlook, challenges, and conclusions for the intelligent AT are stated. The intelligent AT presents clear advantages for patients with voice impairments, demonstrating significant social value.


Assuntos
Faringe , Voz , Humanos , Som , Algoritmos , Redes Neurais de Computação
7.
Rev. logop. foniatr. audiol. (Ed. impr.) ; 44(1): [100330], Jan-Mar, 2024. ilus, tab
Article in English | IBECS | ID: ibc-231906

ABSTRACT

Introduction: To use a test in a language or culture other than the original, it is necessary to carry out a psychometric validation in addition to the adaptation. This systematic review assesses the validation studies of Spanish-language voice self-report scales. Methods: A systematic review was performed across ten databases. The assessment followed the criteria proposed by Terwee et al. (2007), together with some criteria proposed specifically for this study. Validation studies of self-report voice scales in Spanish published in indexed journals were included; the search was last updated on February 2nd, 2023. Results: 15 studies evaluating 12 scales were reviewed. Not all validations conformed to the criteria used, and the properties reported to support the metric strength of the validations were, in general, few. Conclusions: This systematic review shows that the included studies report little evidence of metric quality. Different strategies have since been developed to obtain more and better evidence of reliability and validity, and broader, more current evaluation protocols could be applied. Our contribution is a reflection on current practice in validating Spanish-language self-report scales; we propose to continue this work with a meta-analytic study.(AU)




Assuntos
Humanos , Masculino , Feminino , Voz , Psicometria , Fonoaudiologia , Autorrelato
8.
J Dr Nurs Pract ; 17(1): 3-10, 2024 Mar 27.
Article in English | MEDLINE | ID: mdl-38538113

ABSTRACT

Background: Many health professionals report feeling uncomfortable talking with patients who hear voices. Patients who hear voices report feeling a lack of support and empathy from emergency nurses. A local emergency department reported a need for training for nurses in the care of behavioral health patients. Objective: The aim of this study is to implement a quality improvement project using a hearing voices simulation. Empathy was measured using the Toronto Empathy Questionnaire, and a post-intervention survey was used to evaluate emergency nurses' perception of the professional development session. Methods: The quality improvement project included the implementation of a hearing voices simulation with emergency nurses. A paired t-test was used to determine the differences in the nurses' empathy levels pre- and post-simulation. Qualitative data were collected on the nurses' experience during the simulation debriefing. A Likert-style questionnaire was used to collect data on the nurses' evaluation of the simulation. Results: The results of the hearing voices simulation showed a statistically significant increase (p < .00) in empathy from baseline (M = 47.95, SD = 6.55) to post-intervention empathy scores (M = 48.93, SD = 6.89). The results of the post-simulation survey indicated that nurses felt that the hearing voices simulation was useful (n = 100; 98%) and helped them to feel more empathetic toward patients who hear voices (n = 98; 96%). Conclusions: Using a hearing voices simulation may help emergency nurses feel more empathetic toward behavioral health patients who hear voices. Implications for Nursing: Through the implementation of a hearing voices simulation, clinical staff educators can provide support to staff nurses in the care of behavioral health patients.
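
The pre/post comparison above relies on a paired t-test; a bare-bones version on hypothetical pre- and post-simulation empathy scores (invented numbers, not the study's raw data) is:

```python
import math

def paired_t(pre, post):
    """Paired t statistic: mean of the within-subject differences
    divided by its standard error."""
    diffs = [b - a for a, b in zip(pre, post)]
    n = len(diffs)
    mean = sum(diffs) / n
    var = sum((d - mean) ** 2 for d in diffs) / (n - 1)  # sample variance
    se = math.sqrt(var / n)
    return mean / se

# Hypothetical empathy scores for six nurses before and after the simulation.
pre  = [45, 50, 47, 52, 44, 49]
post = [47, 51, 48, 53, 46, 50]
t = paired_t(pre, post)
```

The resulting t is compared against the t distribution with n - 1 degrees of freedom to obtain the p value.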


Assuntos
Empatia , Voz , Humanos , Alucinações , Emoções , Audição
9.
JAMA ; 331(15): 1259-1261, 2024 04 16.
Article in English | MEDLINE | ID: mdl-38517420

ABSTRACT

In this Medical News article, Edward Chang, MD, chair of the department of neurological surgery at the University of California, San Francisco Weill Institute for Neurosciences joins JAMA Editor in Chief Kirsten Bibbins-Domingo, PhD, MD, MAS, to discuss the potential for AI to revolutionize communication for those unable to speak due to aphasia.


Assuntos
Afasia , Inteligência Artificial , 60453 , Fala , Voz , Humanos , Fala/fisiologia , Voz/fisiologia , Qualidade da Voz , Afasia/etiologia , Afasia/terapia , Equipamentos e Provisões
10.
JASA Express Lett ; 4(3)2024 03 01.
Article in English | MEDLINE | ID: mdl-38426889

ABSTRACT

The discovery that listeners more accurately identify words repeated in the same voice than in a different voice has had an enormous influence on models of representation and speech perception. Although the effect has been widely replicated in English, we understand little about whether and how it generalizes across languages. In a continuous recognition memory study with Hindi speakers and listeners (N = 178), we replicated the talker-specificity effect for accuracy-based measures (hit rate and d'), and found the latency advantage to be marginal (p = 0.06). These data help us better understand talker-specificity effects cross-linguistically and highlight the importance of expanding work to less studied languages.
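
The accuracy measures above, hit rate and d', are standard signal-detection quantities; d' can be computed from hit and false-alarm proportions as sketched below (the proportions are invented for illustration).

```python
from statistics import NormalDist

def d_prime(hit_rate, fa_rate):
    """Sensitivity index d': z(hit rate) - z(false-alarm rate),
    where z is the inverse standard normal CDF."""
    z = NormalDist().inv_cdf
    return z(hit_rate) - z(fa_rate)

# Hypothetical recognition performance in a continuous recognition task.
same_voice = d_prime(0.85, 0.20)   # words repeated in the same voice
diff_voice = d_prime(0.75, 0.20)   # words repeated in a different voice
```

A larger d' for same-voice repetitions, with the false-alarm rate held constant, is the accuracy signature of the talker-specificity effect.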


Assuntos
Percepção da Fala , Voz , Humanos , Idioma , Reconhecimento Psicológico
11.
BMJ Open ; 14(2): e076998, 2024 Feb 24.
Article in English | MEDLINE | ID: mdl-38401896

ABSTRACT

INTRODUCTION: Over the past decade, several machine learning (ML) algorithms have been investigated to assess their efficacy in detecting voice disorders. Literature indicates that ML algorithms can detect voice disorders with high accuracy. This suggests that ML has the potential to assist clinicians in the analysis and treatment outcome evaluation of voice disorders. However, despite numerous research studies, none of the algorithms have been sufficiently reliable to be used in clinical settings. Through this review, we aim to identify critical issues that have inhibited the use of ML algorithms in clinical settings by identifying standard audio tasks, acoustic features, processing algorithms and environmental factors that affect the efficacy of those algorithms. METHODS: We will search the following databases: Web of Science, Scopus, Compendex, CINAHL, Medline, IEEE Xplore and Embase. Our search strategy has been developed with the assistance of the university library staff to accommodate the different syntactical requirements. The literature search will include the period between 2013 and 2023, and will be confined to articles published in English. We will exclude editorials, ongoing studies and working papers. The selection, extraction and analysis of the search data will be conducted using the 'Preferred Reporting Items for Systematic Reviews and Meta-Analyses extension for scoping reviews' system. The same system will also be used for the synthesis of the results. ETHICS AND DISSEMINATION: This scoping review does not require ethics approval as the review solely consists of peer-reviewed publications. The findings will be presented in peer-reviewed publications related to voice pathology.


Assuntos
Distúrbios da Voz , Voz , Humanos , Distúrbios da Voz/diagnóstico , Algoritmos , MEDLINE , Aprendizado de Máquina , Revisões Sistemáticas como Assunto , Literatura de Revisão como Assunto
12.
JASA Express Lett ; 4(2)2024 Feb 01.
Article in English | MEDLINE | ID: mdl-38350076

ABSTRACT

Human voice directivity shows horizontal asymmetries caused by the shape of the lips or the position of the teeth and tongue during vocalization. This study presents and analyzes the asymmetries of voice directivity datasets for 23 different phonemes. The asymmetries were determined from datasets obtained in previous measurements with 13 subjects in a surrounding spherical microphone array. The results show that asymmetries are inherent to human voice production and that they differ between phoneme groups, with the strongest effect on the [s], the [l], and the nasals [m], [n], and [ŋ]. The least asymmetry was found for the plosives.


Assuntos
Voz , Humanos , Língua
13.
Eur Arch Otorhinolaryngol ; 281(5): 2707-2716, 2024 May.
Article in English | MEDLINE | ID: mdl-38319369

ABSTRACT

PURPOSE: This cross-sectional study aimed to investigate the potential of voice analysis as a prescreening tool for type II diabetes mellitus (T2DM) by examining the differences in voice recordings between non-diabetic and T2DM participants. METHODS: 60 participants diagnosed as non-diabetic (n = 30) or T2DM (n = 30) were recruited on the basis of specific inclusion and exclusion criteria in Iran between February 2020 and September 2023. Participants were matched according to their year of birth and then placed into six age categories. Using the WhatsApp application, participants recorded the translated versions of speech elicitation tasks. Seven acoustic features [fundamental frequency, jitter, shimmer, harmonic-to-noise ratio (HNR), cepstral peak prominence (CPP), voice onset time (VOT), and formants (F1-F2)] were extracted from each recording and analyzed using Praat software. Data were analyzed with Kolmogorov-Smirnov, two-way ANOVA, post hoc Tukey, binary logistic regression, and Student's t-tests. RESULTS: The comparison between groups showed significant differences in fundamental frequency, jitter, shimmer, CPP, and HNR (p < 0.05), while there were no significant differences in formants and VOT (p > 0.05). Binary logistic regression showed that shimmer was the most significant predictor of the disease group. There was also a significant difference between diabetes status and age in the case of CPP. CONCLUSIONS: Participants with type II diabetes exhibited significant vocal variations compared to non-diabetic controls.
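
Two of the features above, jitter and shimmer, are cycle-to-cycle variability measures; a simplified "local" version over invented period and amplitude sequences (not the study's recordings, and simpler than Praat's full definitions) can be sketched as:

```python
def relative_cycle_variability(values):
    """Mean absolute difference between consecutive cycle values,
    divided by the mean value. Applied to glottal periods this is a
    simplified local jitter; applied to peak amplitudes, a simplified
    local shimmer."""
    diffs = [abs(b - a) for a, b in zip(values, values[1:])]
    return (sum(diffs) / len(diffs)) / (sum(values) / len(values))

# Hypothetical period (s) and peak-amplitude sequences for a sustained vowel.
periods = [0.0080, 0.0082, 0.0079, 0.0081, 0.0080]
amps = [0.51, 0.50, 0.53, 0.49, 0.52]
jit = relative_cycle_variability(periods)    # simplified local jitter
shim = relative_cycle_variability(amps)      # simplified local shimmer
```

A perfectly periodic, constant-amplitude voice would score zero on both; elevated values indicate the irregular vocal fold vibration the study associates with T2DM.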


Assuntos
Diabetes Mellitus Tipo 2 , Voz , Humanos , Qualidade da Voz , Acústica da Fala , Diabetes Mellitus Tipo 2/complicações , Estudos Transversais , Medida da Produção da Fala , Acústica
14.
JAMA Otolaryngol Head Neck Surg ; 150(4): 283-284, 2024 Apr 01.
Article in English | MEDLINE | ID: mdl-38386315

ABSTRACT

This Viewpoint discusses the need to create standards for audiomics to identify unique audio biomarkers of health and disease, now possible because of more efficient voice data analysis available through the use of artificial intelligence (AI), and to improve patient care.


Assuntos
Voz , Humanos
15.
Nurs Open ; 11(2): e2101, 2024 Feb.
Article in English | MEDLINE | ID: mdl-38391105

ABSTRACT

AIM: Discussing nurses' voice behaviour could support managers in making the right decisions and solving problems. DESIGN: This was a discursive paper. METHODS: The discussion was based on a review of the literature. RESULTS: Nurses play a critical role in offering useful constructive advice, which helps management identify and solve problems promptly so as to improve the working environment. We therefore assert that trust in leadership and the leader-leader exchange system also play a critical role in fostering voice behaviour. Trust is a crucial aspect of voice behaviour, and integrating trust in leadership with leader-leader exchange is proposed as a practical approach to fostering it. Nurse managers must maintain a sense of reciprocal moral obligation in order to nurture value-driven voice behaviour; open dialogue, active listening and trust in leadership are essential. Nurse managers must consider ways to foster mutual trust, and to support and enable nurses to use voice behaviour in everyday practice.


Assuntos
Relações Interprofissionais , Liderança , Enfermeiras e Enfermeiros , Confiança , Voz , Humanos , Enfermeiras Administradoras
16.
J Acoust Soc Am ; 155(2): 1071-1085, 2024 02 01.
Article in English | MEDLINE | ID: mdl-38341737

ABSTRACT

Children's speech understanding is vulnerable to indoor noise and reverberation, e.g., in classrooms. It is unknown how children develop the ability to use temporal acoustic cues, specifically amplitude modulation (AM) and voice onset time (VOT), which are important for perceiving distorted speech. Through three experiments, we investigated the typical development of AM depth detection in vowels (experiment I), categorical perception of VOT (experiment II), and consonant identification (experiment III) in quiet and in speech-shaped noise (SSN) and mild reverberation in 6- to 14-year-old children. Our findings suggested that AM depth detection using a naturally produced vowel at the rate of the fundamental frequency was particularly difficult for children, especially under acoustic distortions. While the VOT cue salience was monotonically attenuated with increasing levels of SSN, its utility for consonant discrimination was completely removed even under mild reverberation. The reverberant energy decay distorting critical temporal cues provided further evidence that may explain the error patterns observed in consonant identification. By 11-14 years of age, children approached adult-like performance in consonant discrimination and identification under adverse acoustics, emphasizing the need for good acoustics for younger children as they develop the auditory skills to process distorted speech in everyday listening environments.


Assuntos
Percepção da Fala , Voz , Adulto , Criança , Humanos , Adolescente , Ruído/efeitos adversos , Acústica , Fala
17.
J Acoust Soc Am ; 155(2): 1253-1263, 2024 02 01.
Article in English | MEDLINE | ID: mdl-38341748

ABSTRACT

The reassigned spectrogram (RS) has emerged as the most accurate way to infer vocal tract resonances from the acoustic signal [Shadle, Nam, and Whalen (2016). "Comparing measurement errors for formants in synthetic and natural vowels," J. Acoust. Soc. Am. 139(2), 713-727]. To date, validating its accuracy has depended on formant synthesis for ground truth values of these resonances. Synthesis is easily controlled, but it has many intrinsic assumptions that do not necessarily accurately realize the acoustics in the way that physical resonances would. Here, we show that physical models of the vocal tract with derivable resonance values allow a separate approach to the ground truth, with a different range of limitations. Our three-dimensional printed vocal tract models were excited by white noise, allowing an accurate determination of the resonance frequencies. Then, sources with a range of fundamental frequencies were implemented, allowing a direct assessment of whether RS avoided the systematic bias towards the nearest strong harmonic to which other analysis techniques are prone. RS was indeed accurate at fundamental frequencies up to 300 Hz; above that, accuracy was somewhat reduced. Future directions include testing mechanical models with the dimensions of children's vocal tracts and making RS more broadly useful by automating the detection of resonances.
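
The "derivable resonance values" mentioned above come from tube acoustics: a uniform tube closed at the glottis end and open at the lips resonates at odd quarter-wavelength frequencies, f_n = (2n - 1)·c / 4L. A quick check against the textbook ~17.5 cm neutral vocal tract (these are standard approximations, not the paper's measured model dimensions):

```python
def tube_resonances(length_m, n_resonances=3, c=350.0):
    """Resonance frequencies (Hz) of a uniform quarter-wavelength
    resonator: f_n = (2n - 1) * c / (4 * L)."""
    return [(2 * n - 1) * c / (4.0 * length_m)
            for n in range(1, n_resonances + 1)]

# Neutral adult vocal tract, ~17.5 cm long, speed of sound ~350 m/s.
f1, f2, f3 = tube_resonances(0.175)
```

This yields the familiar ~500/1500/2500 Hz pattern; a printed physical model's resonances can be derived the same way from its geometry and then compared against what the reassigned spectrogram recovers from the acoustic signal.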


Assuntos
Voz , Criança , Humanos , Acústica , Acústica da Fala , Vibração , Espectrografia do Som
18.
Acta Otorhinolaryngol Ital ; 44(1): 27-35, 2024 Feb.
Article in English | MEDLINE | ID: mdl-38420719

ABSTRACT

Objective: The aim of this study was to compare the efficacy of voice therapy combined with standard anti-reflux therapy in reducing symptoms and signs of laryngopharyngeal reflux (LPR). Methods: A randomised clinical trial was conducted. Fifty-two patients with LPR diagnosed by 24 h multichannel intraluminal impedance-pH monitoring were randomly allocated into two groups: medical treatment (MT) and medical treatment plus voice therapy (VT). Clinical symptoms and laryngeal signs were assessed at baseline and after 3 months of treatment with the Reflux Symptom Index (RSI), Reflux Finding Score (RFS), Voice Handicap Index (VHI) and GRBAS scales. Results: The groups had similar scores at baseline. At 3-month follow-up, a significant decrease in RSI and RFS total scores was found in both groups, although it appeared to be more robust in the VT group. The G and R scores of the GRBAS scale significantly improved after treatment in both groups, with better results in the VT group. The VHI total score at 3 months improved more in the VT group (VHI delta 9.54) than in the MT group (VHI delta 5.38) (p < 0.001). Conclusions: The addition of voice therapy to medication and diet appears to be more effective in improving treatment outcomes in subjects with LPR. Voice therapy warrants consideration in addition to medication and diet when treating patients with LPR.


Assuntos
Refluxo Laringofaríngeo , Voz , Humanos , Refluxo Laringofaríngeo/diagnóstico , Refluxo Laringofaríngeo/tratamento farmacológico , Projetos Piloto , Inibidores da Bomba de Prótons/uso terapêutico , Qualidade da Voz
19.
PeerJ ; 12: e16904, 2024.
Article in English | MEDLINE | ID: mdl-38371372

ABSTRACT

Background: The ability to differentiate familiar from unfamiliar humans has been considered a product of domestication or early experience. Few studies have focused on voice recognition in Felidae despite the fact that this family presents the rare opportunity to compare domesticated species to their wild counterparts and to examine the role of human rearing. Methods: We tested whether non-domesticated Felidae species recognized familiar human voices by exposing them to audio playbacks of familiar and unfamiliar humans. In a pilot study, we presented seven cats of five species with playbacks of voices that varied in familiarity and use of the cats' names. In the main study, we presented 24 cats of 10 species with unfamiliar and then familiar voice playbacks using a habituation-dishabituation paradigm. We anticipated that human rearing and use of the cats' names would result in greater attention to the voices, as measured by the latency, intensity, and duration of responses regardless of subject sex and subfamily. Results: Cats responded more quickly and with greater intensity (e.g., full versus partial head turn, both ears moved versus one ear twitching) to the most familiar voice in both studies. They also responded for longer durations to the familiar voice compared to the unfamiliar voices in the main study. Use of the cats' name and rearing history did not significantly impact responding. These findings suggest that close human contact rather than domestication is associated with the ability to discriminate between human voices and that less social species may have socio-cognitive abilities akin to those of more gregarious species. With cats of all species being commonly housed in human care, it is important to know that they differentiate familiar from unfamiliar human voices.


Assuntos
Felidae , Voz , Humanos , Animais , Cuidadores , Projetos Piloto , Reconhecimento Psicológico/fisiologia , Voz/fisiologia
20.
PLoS Comput Biol ; 20(2): e1011849, 2024 Feb.
Article in English | MEDLINE | ID: mdl-38315733

ABSTRACT

Sleep deprivation has an ever-increasing impact on individuals and societies. Yet, to date, there is no quick and objective test for sleep deprivation. Here, we used automated acoustic analyses of the voice to detect sleep deprivation. Building on current machine-learning approaches, we focused on interpretability by introducing two novel ideas: the use of a fully generic auditory representation as input feature space, combined with an interpretation technique based on reverse correlation. The auditory representation consisted of a spectro-temporal modulation analysis derived from neurophysiology. The interpretation method aimed to reveal the regions of the auditory representation that supported the classifiers' decisions. Results showed that generic auditory features could be used to detect sleep deprivation successfully, with an accuracy comparable to state-of-the-art speech features. Furthermore, the interpretation revealed two distinct effects of sleep deprivation on the voice: changes in slow temporal modulations related to prosody and changes in spectral features related to voice quality. Importantly, the relative balance of the two effects varied widely across individuals, even though the amount of sleep deprivation was controlled, thus confirming the need to characterize sleep deprivation at the individual level. Moreover, while the prosody factor correlated with subjective sleepiness reports, the voice quality factor did not, consistent with the presence of both explicit and implicit consequences of sleep deprivation. Overall, the findings show that individual effects of sleep deprivation may be observed in vocal biomarkers. Future investigations correlating such markers with objective physiological measures of sleep deprivation could enable "sleep stethoscopes" for the cost-effective diagnosis of the individual effects of sleep deprivation.
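
The reverse-correlation interpretation described above can be sketched simply: average the feature maps of samples the classifier labels "sleep-deprived", subtract the average of the rest, and the large-magnitude regions are the ones driving the decision. A toy version over invented 4-bin spectro-temporal modulation vectors (not the paper's auditory representation):

```python
def reverse_correlation(features, decisions):
    """Difference between the mean feature vector of positively and
    negatively classified samples; bins with large magnitude are the
    regions of the representation supporting the classifier's decisions."""
    pos = [f for f, d in zip(features, decisions) if d == 1]
    neg = [f for f, d in zip(features, decisions) if d == 0]
    mean = lambda vecs, i: sum(v[i] for v in vecs) / len(vecs)
    return [mean(pos, i) - mean(neg, i) for i in range(len(features[0]))]

# Toy modulation-energy vectors (4 bins) and classifier decisions.
features = [[0.9, 0.2, 0.1, 0.4],   # classified sleep-deprived
            [0.8, 0.3, 0.2, 0.4],   # classified sleep-deprived
            [0.2, 0.2, 0.1, 0.4],   # classified rested
            [0.3, 0.3, 0.2, 0.4]]   # classified rested
mask = reverse_correlation(features, [1, 1, 0, 0])
```

Here only the first bin separates the two classes, so the mask isolates it; on a full spectro-temporal representation, the analogous maps would highlight the slow temporal modulations (prosody) and spectral regions (voice quality) the study reports.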


Assuntos
Privação do Sono , Voz , Humanos , Sono , Qualidade da Voz , Vigília